A Very Intro Guide to COS Data

In [1]:
# Import the necessary libraries
from astroquery.mast import Observations # For actually searching and downloading
from pathlib import Path # Handling system paths
# For reading and editing fits astropy table format files
from astropy.table import Table
# For reading and editing general fits files
from astropy.io import fits
# For dealing with units and unit conversions
from astropy import units as u
# Plotting
import matplotlib.pyplot as plt
%matplotlib inline
import plotly.express as px
import plotly.graph_objects as go

Start by downloading some data

  • FUV
    • Proposal 15366
    • Andy might recognize this proposal as his COS LP4 Spectral Resolution Program!
  • NUV
    • Picked at random

We begin by downloading the final spectrum product (X1DSUM) of an FUV observation.

In [2]:
# Download an example FUV dataset using astroquery:
## Find all the observations from a single HST Proposal
obs_from_proposal = Observations.query_criteria(
    proposal_id="15366" # The Proposal ID of the observations to download
)
## Find all the data products for these observations
products_from_proposal = Observations.get_product_list(
    obs_from_proposal
)
print(f"Found {len(products_from_proposal)} total data products")
## Filter to a specific subset, i.e. of filetypes
products_to_download = Observations.filter_products(
    products_from_proposal,
    productSubGroupDescription=["X1DSUM","ASN"] # Filters to only the X1DSUM and ASN files
)
# Download the FUV products
download_table = Observations.download_products(products_to_download)
# Make a list of the local paths, aggregated by type of file
fuv_x1dsum_products = [Path(local_path) for local_path in download_table["Local Path"] if "x1dsum" in local_path]
fuv_asn_products = [Path(local_path) for local_path in download_table["Local Path"] if "asn" in local_path]
print("FUV X1DSUM Files: \n", fuv_x1dsum_products, "\nFUV ASN Files: \n", fuv_asn_products)
Found 266 total data products
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ldm701010_asn.fits to ./mastDownload/HST/ldm701010/ldm701010_asn.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ldm701010_x1dsum.fits to ./mastDownload/HST/ldm701010/ldm701010_x1dsum.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ldm701020_asn.fits to ./mastDownload/HST/ldm701020/ldm701020_asn.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ldm701020_x1dsum.fits to ./mastDownload/HST/ldm701020/ldm701020_x1dsum.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ldm701030_asn.fits to ./mastDownload/HST/ldm701030/ldm701030_asn.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ldm701030_x1dsum.fits to ./mastDownload/HST/ldm701030/ldm701030_x1dsum.fits ... [Done]
FUV X1DSUM Files: 
 [PosixPath('mastDownload/HST/ldm701010/ldm701010_x1dsum.fits'), PosixPath('mastDownload/HST/ldm701020/ldm701020_x1dsum.fits'), PosixPath('mastDownload/HST/ldm701030/ldm701030_x1dsum.fits')] 
FUV ASN Files: 
 [PosixPath('mastDownload/HST/ldm701010/ldm701010_asn.fits'), PosixPath('mastDownload/HST/ldm701020/ldm701020_asn.fits'), PosixPath('mastDownload/HST/ldm701030/ldm701030_asn.fits')]

We also want to download an example of the raw data (RAWTAG), and the intermediate data step (CORRTAG)

Note that we're just condensing the above functions into a more compact form. It's the same content as above.

Also note, this is not all the data which went into the final products downloaded above. It corresponds to a single exposure on a single segment of the FUV detector.

In [3]:
rawtag_a = Path(Observations.download_products(
    Observations.filter_products(
        Observations.get_product_list(
            Observations.query_criteria(
                proposal_id="15366"
            )
        ),
        productSubGroupDescription=["RAWTAG_A"]
    )[0]
)["Local Path"][0])

corrtag_a = Path(Observations.download_products(
    Observations.filter_products(
        Observations.get_product_list(
            Observations.query_criteria(
                proposal_id="15366"
            )
        ),
        productSubGroupDescription=["CORRTAG_A"]
    )[0]
)["Local Path"][0])

print(f"\n\nRaw TIME-TAG data from segment A in: {rawtag_a}")
print(f"Corrected TIME-TAG data from segment A in: {corrtag_a}")
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ldm701qnq_rawtag_a.fits to ./mastDownload/HST/ldm701qnq/ldm701qnq_rawtag_a.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/ldm701qnq_corrtag_a.fits to ./mastDownload/HST/ldm701qnq/ldm701qnq_corrtag_a.fits ... [Done]


Raw TIME-TAG data from segment A in: mastDownload/HST/ldm701qnq/ldm701qnq_rawtag_a.fits
Corrected TIME-TAG data from segment A in: mastDownload/HST/ldm701qnq/ldm701qnq_corrtag_a.fits

Finally, let's download an example NUV dataset using astroquery in this more condensed form

In [4]:
nuv_x1dsum_g230l = Path(Observations.download_products(
    Observations.filter_products(
        Observations.get_product_list(
            Observations.query_criteria(
                obs_id = 'lbbd01020'
            )
        ),
        productSubGroupDescription=["X1DSUM"]
    )[0]
)["Local Path"][0])

nuv_x1dsum_g185m = Path(Observations.download_products(
    Observations.filter_products(
        Observations.get_product_list(
            Observations.query_criteria(
                obs_id = 'LAAD020L0'
            )
        ),
        productSubGroupDescription=["X1DSUM"]
    )[0]
)["Local Path"][0])
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/lbbd01020_x1dsum.fits to ./mastDownload/HST/lbbd01020/lbbd01020_x1dsum.fits ... [Done]
Downloading URL https://mast.stsci.edu/api/v0.1/Download/file?uri=mast:HST/product/laad020l0_x1dsum.fits to ./mastDownload/HST/laad020l0/laad020l0_x1dsum.fits ... [Done]

Examining the data products

The RAW and CORR Data:

In [5]:
hdr0 = fits.getheader(rawtag_a)
hdr1 = fits.getheader(rawtag_a, ext=1)
print(f"This is a {hdr1['EXPTIME']} second exposure on segment {hdr0['SEGMENT']} with the {hdr0['OPT_ELEM']} grating and the {hdr0['CENTRWV']} â„« central wavelength setting.")

# My favorite way to read in FITS data - there are others!
raw_data = Table.read(rawtag_a, hdu=1)
corr_data = Table.read(corrtag_a, hdu=1)
This is a 403.0 second exposure on segment FUVA with the G130M grating and the 1222.0 â„« central wavelength setting.

We can look at a few of the counts to see how the data table is formatted.

  • First the RAW data
  • Second the CORR data
    • Here th XFULL and YFULL are the fully corrected locations on the detector.
In [6]:
raw_data[:10]
Out[6]:
Table length=10
TIMERAWXRAWYPHA
secondspixelspixels
float32int16int16uint8
0.01246341111
0.049783919
0.030364059
0.058084227
0.042254037
0.01206440611
0.0879440620
0.01208443010
0.0150204125
0.043494258
In [7]:
corr_data[:10]
Out[7]:
Table length=10
TIMERAWXRAWYXCORRYCORRXDOPPXFULLYFULLWAVELENGTHEPSILONDQPHA
spixpixpixpixpixpixpixAngstrom
float32int16int16float32float32float32float32float32float32float32int16uint8
0.01246341112371.331415.6339412373.06312883.342415.799651339.40311.058962011
0.049783914968.0923407.571324969.7295480.0073404.907781265.64781.07915209
0.030364053024.64419.70193026.25173536.5303417.46211246.28611.154434309
0.058084225795.44433.928285797.08746307.3657432.169981273.89041.065008907
0.042254034208.327418.841984209.9544720.2324416.360051258.07861.115729507
0.01206440611983.683411.5204811985.4112495.688411.73331335.54101.0555439011
0.087944068754.496409.54438756.1829266.46410.131041303.37001.3491015420
0.01208443011999.595436.3828712001.32212511.601436.6851335.69961.056003010
0.01502041214917.777413.866114919.54215429.82415.665071364.77221.123113605
0.043494254335.36436.681924336.9894847.267434.291871259.34411.109477208

Let's take a look at how the counts fell on the detector.

In [8]:
# So this runs quickly, limit to first 80k counts
fig = go.Figure()
fig.add_trace(
    go.Scattergl(
        x=raw_data["RAWX"][:80000].astype(float),
        y=raw_data["RAWY"][:80000].astype(float),
        mode="markers",
        name="RAWTAG Data"
    )
)
fig.add_trace(
    go.Scattergl(
    x=corr_data["XFULL"][:80000].astype(float),
    y=corr_data["YFULL"][:80000].astype(float),
        mode="markers",
        name="CORRTAG Data"
    )
)

fig.update_layout(
    title="Looking at the raw and corrected counts",
    xaxis_title="Detector X Coordinates",
    yaxis_title="Detector Y Coordinates",
    legend_title="Data type"
)

The power of this type of "TIME-TAG"data is that we have more control and information on each individual photon encounter. Because of this we have the ability to...

  • Remove suspected noise
  • Filter out "bad" time intervals
  • Split apart our data and examine it at different times

Illustrating the above processes

Now let's examine the FUV spectrum

In [9]:
for fuv_x1d in fuv_x1dsum_products:
    hdr0 = fits.getheader(fuv_x1d)
    hdr1 = fits.getheader(fuv_x1d, ext=1)
    print(f"{hdr0['ASN_ID']} is a {hdr1['EXPTIME']} second (combined) exposure with the {hdr0['OPT_ELEM']} grating and the {hdr0['CENTRWV']} â„« central wavelength setting.")
LDM701010 is a 1552.064 second (combined) exposure with the G130M grating and the 1216.296022718659 â„« central wavelength setting.
LDM701020 is a 2070.048 second (combined) exposure with the G160M grating and the 1568.559555114382 â„« central wavelength setting.
LDM701030 is a 2070.048 second (combined) exposure with the G160M grating and the 1615.750953189973 â„« central wavelength setting.

We'll look at the structure of the first (G130M) spectrum:

  • We note there are two rows for each of the segments of the detector
    • If we want to plot, we'll need to deal with each of these segments separately
  • For NUV data, there are three rows for each "Data stripe"
In [10]:
fuv_g130m = Table.read(fuv_x1dsum_products[0])
fuv_g130m
WARNING: UnitsWarning: 'erg /s /cm**2 /angstrom' contains multiple slashes, which is discouraged by the FITS standard [astropy.units.format.generic]
Out[10]:
Table length=2
SEGMENTEXPTIMENELEMWAVELENGTH [16384]FLUX [16384]ERROR [16384]ERROR_LOWER [16384]GROSS [16384]GCOUNTS [16384]VARIANCE_FLAT [16384]VARIANCE_COUNTS [16384]VARIANCE_BKG [16384]NET [16384]BACKGROUND [16384]DQ [16384]DQ_WGT [16384]
sAngstromerg / (Angstrom cm2 s)erg / (Angstrom cm2 s)ct / sct / sctctctctct / sct / s
bytes4float64int32float64float32float32float32float32float32float32float32float32float32float32int16float32
FUVA1552.064163841211.0534905949228 .. 1374.26806389723080.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.0128 .. 1280.0 .. 0.0
FUVB1552.064163841058.054288663829 .. 1221.07520978127950.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.0128 .. 1280.0 .. 0.0
In [11]:
# Set up the plot as a single box with size of 10x4 inches, and with a dpi of 100, relevant should we choose to save it:
fig1, ax = plt.subplots(1,1,figsize=(10,4), dpi = 200)  

# The next few lines are the core of the cell, where we actually place the data onto the plot:
###############
# Access each row of data, the longer wvln segment and gets the data we need to plot a spectrum:
for i, row in enumerate(fuv_g130m):
    wvln, flux, segment  = row["WAVELENGTH", "FLUX", "SEGMENT"] 
    ax.plot(wvln, flux, # First two arguments are assumed to be the x-data, y-data
            linestyle = "-", linewidth = 0.25, c = "kb"[i], # These parameters specify the look of the connecting line
            marker = '.', markersize = 1, markerfacecolor = 'r', markeredgewidth = 0, # The marker parameters specify how the data points will look... 
                                                                                    # ... if you don't want dots set marker = ''
            label = segment) # The label is an optional parameter which will allow us to create a legend 
                            # this label is useful when there are multiple datasets on the same plot
# The lines after this are all about formatting, adding text, and saving as an image
###############
ax.set_title("G130M COS Spectrum", size = 20) # Adds a title of fontsize 20 points
ax.set_xlabel('Wavelength [$\AA$]', size = 12) # Adds x axis label
ax.set_ylabel('Flux [$erg\ s^{-1}\ cm^{-2}\ Angstrom^{-1}$]', size = 12) # Adds y label

plt.legend(loc = 'upper right') # Adds a legend with the label specified in the plotting call
plt.tight_layout() # Trims blank space
plt.show() # Shows all the plot calls in this cell and "clears" the plotting space - must come after any saving you want to do
In [12]:
# Set up the plot as a single box with size of 10x4 inches, and with a dpi of 100, relevant should we choose to save it:
fig1, ax = plt.subplots(1,1,figsize=(10,4), dpi = 100)  

# The next few lines are the core of the cell, where we actually place the data onto the plot:
###############
for j, fuv_file in enumerate(fuv_x1dsum_products):
    fuv_tab = Table.read(fuv_file)
    data_label = f"{fits.getval(fuv_file, 'OPT_ELEM')}/{fits.getval(fuv_file, 'CENTRWV'):.0f} "
    for i, row in enumerate(fuv_tab):
        wvln, flux, segment  = row["WAVELENGTH", "FLUX", "SEGMENT"] 
        ax.plot(wvln, flux, # First two arguments are assumed to be the x-data, y-data
                linestyle = "-", linewidth = 0.5, c = ["bg","mk","yc"][j][i], # These parameters specify the look of the connecting line
                marker = '', markersize = 2, markeredgewidth = 0, # The marker parameters specify how the data points will look... 
                label = data_label+segment, alpha=1-0.1*j)
###############
ax.set_title("G130M+G160M COS Spectrum", size = 20) # Adds a title of fontsize 20 points
ax.set_xlabel('Wavelength [$\AA$]', size = 12) # Adds x axis label
ax.set_ylabel('Flux [$erg\ s^{-1}\ cm^{-2}\ Angstrom^{-1}$]', size = 12) # Adds y label

plt.legend(loc = 'upper right') # Adds a legend with the label specified in the plotting call
plt.tight_layout() # Trims blank space
plt.show() # Shows all the plot calls in this cell and "clears" the plotting space - must come after any saving you want to do
WARNING: UnitsWarning: 'erg /s /cm**2 /angstrom' contains multiple slashes, which is discouraged by the FITS standard [astropy.units.format.generic]
WARNING: UnitsWarning: 'erg /s /cm**2 /angstrom' contains multiple slashes, which is discouraged by the FITS standard [astropy.units.format.generic]
WARNING: UnitsWarning: 'erg /s /cm**2 /angstrom' contains multiple slashes, which is discouraged by the FITS standard [astropy.units.format.generic]

Finally let's look at some NUV spectra

We'll show the structure of the data, then make some plots

In [13]:
nuv_tab = Table.read(nuv_x1dsum_g185m)
nuv_tab
WARNING: UnitsWarning: 'erg /s /cm**2 /angstrom' contains multiple slashes, which is discouraged by the FITS standard [astropy.units.format.generic]
WARNING: UnitsWarning: 'count /s /pixel' contains multiple slashes, which is discouraged by the FITS standard [astropy.units.format.generic]
Out[13]:
Table length=3
SEGMENTEXPTIMENELEMWAVELENGTH [1274]FLUX [1274]ERROR [1274]ERROR_LOWER [1274]VARIANCE_FLAT [1274]VARIANCE_COUNTS [1274]VARIANCE_BKG [1274]GROSS [1274]GCOUNTS [1274]NET [1274]BACKGROUND [1274]DQ [1274]DQ_WGT [1274]BACKGROUND_PER_PIXEL [1274]
sAngstromerg / (Angstrom cm2 s)erg / (Angstrom cm2 s)erg / (Angstrom cm2 s)ct / sctct / sct / sct / (pix s)
bytes4float64int32float64float32float32float32float32float32float32float32float32float32float32int16float32float32
NUVA95.04012741773.4769261304905 .. 1821.07837834562040.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.0128 .. 1280.0 .. 0.00.0 .. 0.0
NUVB95.04012741876.9959138961574 .. 1923.38799599787580.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.0128 .. 1280.0 .. 0.00.0 .. 0.0
NUVC95.04012741982.5018025396641 .. 2027.44135455539190.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.00.0 .. 0.0128 .. 1280.0 .. 0.00.0 .. 0.0

Both spectra are of white dwarf stars

  1. First the G185M Grating spectrum of GD71
  2. Second the G230L Grating spectrum of WD1057+719
    1. This is a somewhat strange spectrum COS creates with the G230L grating. The first two stripes are (like all other COS spectra) the first order spectrum of the source. The third stripe is dominated by second order light. It's thus more dispersed, has lower S/N, and falls on top of the first (NUVA) stripe.
In [14]:
nuv_tab = Table.read(nuv_x1dsum_g185m)
# Set up the plot as a single box with size of 10x4 inches, and with a dpi of 100, relevant should we choose to save it:
fig1, ax = plt.subplots(1,1,figsize=(10,4), dpi = 100)  

# The next few lines are the core of the cell, where we actually place the data onto the plot:
###############
# Access each row of data, the longer wvln segment and gets the data we need to plot a spectrum:
for i, row in enumerate(nuv_tab):
    wvln, flux, segment  = row["WAVELENGTH", "FLUX", "SEGMENT"] 
    ax.plot(wvln, flux, # First two arguments are assumed to be the x-data, y-data
            linestyle = "-", linewidth = 0.25, c = "kmb"[i], # These parameters specify the look of the connecting line
            marker = '.', markersize = 2, markerfacecolor = 'r', markeredgewidth = 0, # The marker parameters specify how the data points will look... 
            label = segment)
###############
ax.set_title("G185M NUV COS Spectrum", size = 20) # Adds a title of fontsize 20 points
ax.set_xlabel('Wavelength [$\AA$]', size = 12) # Adds x axis label
ax.set_ylabel('Flux [$erg\ s^{-1}\ cm^{-2}\ Angstrom^{-1}$]', size = 12) # Adds y label

plt.legend(loc = 'upper right') # Adds a legend with the label specified in the plotting call
plt.tight_layout() # Trims blank space
plt.show() # Shows all the plot calls in this cell and "clears" the plotting space - must come after any saving you want to do

This rather atypical NUV spectrum contains both 1st and 2nd order light from the white dwarf source.

In [15]:
nuv_tab = Table.read(nuv_x1dsum_g230l)
# Set up the plot as a single box with size of 10x4 inches, and with a dpi of 100, relevant should we choose to save it:
fig1, ax = plt.subplots(1,1,figsize=(10,4), dpi = 100)  

# The next few lines are the core of the cell, where we actually place the data onto the plot:
###############
# Access each row of data, the longer wvln segment and gets the data we need to plot a spectrum:
for i, row in enumerate(nuv_tab):
    wvln, flux, segment  = row["WAVELENGTH", "FLUX", "SEGMENT"] 
    ax.plot(wvln, flux, # First two arguments are assumed to be the x-data, y-data
            linestyle = "-", linewidth = 0.25, c = "kmb"[i], # These parameters specify the look of the connecting line
            marker = '.', markersize = 2, markerfacecolor = (1,0,0,0.5), markeredgewidth = 0, # The marker parameters specify how the data points will look... 
            label = segment)
###############
ax.set_title("G230L NUV COS Spectrum", size = 20) # Adds a title of fontsize 20 points
ax.set_xlabel('Wavelength [$\AA$]', size = 12) # Adds x axis label
ax.set_ylabel('Flux [$erg\ s^{-1}\ cm^{-2}\ Angstrom^{-1}$]', size = 12) # Adds y label

plt.legend(loc = 'upper right') # Adds a legend with the label specified in the plotting call
plt.tight_layout() # Trims blank space
plt.show() # Shows all the plot calls in this cell and "clears" the plotting space - must come after any saving you want to do
WARNING: UnitsWarning: 'erg /s /cm**2 /angstrom' contains multiple slashes, which is discouraged by the FITS standard [astropy.units.format.generic]
WARNING: UnitsWarning: 'count /s /pixel' contains multiple slashes, which is discouraged by the FITS standard [astropy.units.format.generic]